STATISTICAL REMULTIPLEXING WITH BANDWIDTH ALLOCATION 
AMONG DIFFERENT TRANSCODING CHANNELS 



BACKGROUND OF THE INVENTION 

The present invention relates to a statistical 
remultiplexer for transcoding digital video signals. 

Commonly, it is necessary to adjust a bit rate of 
digital video programs that are provided, e.g., to 
subscriber terminals in a cable television network or 
the like. For example, a first group of signals may be 
received at a headend via a satellite transmission. 
The headend operator may desire to forward selected 
programs to the subscribers while adding programs 
(e.g., commercials or other content) from a local 
source, such as storage media or a local live feed. 
Additionally, it is often necessary to provide the 
programs within an overall available channel bandwidth. 

Accordingly, the statistical remultiplexer (stat
remux), or multi-channel transcoder, which handles
pre-compressed video bit streams by re-compressing them
at a specified bit rate, has been developed.

In such systems, a number of channels of data are
processed by processors arranged in parallel. Each
processor typically can accommodate multiple channels
of data. In some cases, however, such as for HDTV,
which requires many computations, portions of data from
a single channel may be allocated among multiple
processors. Moreover, a fixed transcoding bandwidth is
typically allocated to one or more groups of channels
(stat remux groups).

However, there is a need for an improved stat
remux system that provides a bit rate need parameter
for each channel to enable bits to be allocated for
transcoding the channels in a manner that optimizes the
image quality of the coded data, while still meeting
the constraints of a limited throughput.

The system should estimate the bit rate need
parameter from statistical information that is derived
from the bitstream, such as a frame bit count and
average quantizer scale values of the original
bitstream. The system should be compatible with MPEG-2
bitstreams. The system should allocate a target output
frame bit count for I, P and B frames based on the
coding complexity estimated from the statistical
information of the original bit stream.

Moreover, the system should accommodate MPEG-2
macroblock processing within a frame, by using a
macroblock bit count and quantizer scale values of the
original bit stream to guide the rate control process
to meet the target frame bit count at the output.

The system should provide periodic adjustments of
an allocated transcoding bit rate a number of times in
a video frame.

Additionally, the system should derive quantizer
scale values for transcoding macroblocks in a frame
based on original, pre-transcoding quantizer scale
values. The quantizer scale values should be adjusted
as transcoding of a frame proceeds to ensure that each
macroblock is allocated a minimum number of bits for
transcoding.

The present invention provides a system having the
above and other advantages.



SUMMARY OF THE INVENTION

The present invention relates to a statistical 
remultiplexer for transcoding digital video signals. 
In one aspect of the invention, a bit rate need
parameter is estimated for statistical re-multiplexing
from a frame bit count and average macroblock quantizer
scale values (averaged over a frame) of an original
bitstream, such as an MPEG-2 bitstream. A lookahead
of, e.g., five frames is provided.

The invention allocates a total available
bandwidth among the transcoding channels.

The invention allocates a target output frame
bit count for I, P and B frames based on the coding
complexity estimated from the frame bit count and the
average macroblock quantizer scale (averaged over each
input frame) of the original bit stream.

Furthermore, in another aspect of the invention,
during MPEG-2 macroblock processing within a frame, a
macroblock bit count and quantizer scale value of the
original bit stream are used to guide the rate control
process to meet the target frame bit count at the
output.

Thus, the present invention provides an efficient
statistical remultiplexer for processing data in a
number of channels that include video data. In one
aspect of the invention, transcoding of the video data
is delayed while statistical information is obtained
from the data. Bit rate need parameters for the data
are determined based on the statistical information,
and the video data is transcoded based on the
respective bit rate need parameters following the
delay.

In another aspect of the invention, a transcoding
bit rate for video frames at the stat remux is updated
a plurality of times at successive intervals to allow a
closer monitoring of the bit rate. Moreover, minimum
and maximum bounds for the transcoding bit rate are
updated in each interval. Thus, a portion of a frame
is transcoded in a first interval, then the transcoding
bit rate is updated, then a second portion of the frame
is transcoded in a second interval, then the
transcoding bit rate is updated again, and so forth.

In yet another aspect of the invention, the
pre-transcoding quantization scales of the macroblocks
in a frame are scaled to provide corresponding new
quantization scales for transcoding based on a ratio of
a pre-transcoding amount of data in the frame and a
target, post-transcoding amount of data for the frame.
Moreover, the quantization scales are adjusted for
different portions of the frame as the portions are
transcoded to ensure that a minimum amount of
transcoding bandwidth is allocated to each macroblock.
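
The ratio-based scaling of quantization scales may be
illustrated by the following minimal sketch. The
function name and the clamp that keeps the new scale at
least as coarse as the original are illustrative
assumptions; the per-portion adjustment performed as
transcoding proceeds is not modeled.

```python
def scale_quantizer(old_q: float, input_bits: int, target_bits: int) -> float:
    """Scale a pre-transcoding quantizer scale by input_bits / target_bits.

    Hypothetical helper: a frame that must shrink by a factor of two gets
    a quantizer scale that is twice as coarse.
    """
    if input_bits <= 0 or target_bits <= 0:
        raise ValueError("bit counts must be positive")
    ratio = input_bits / target_bits
    # Assumption: the new scale is never finer than the original one.
    return max(old_q, old_q * ratio)


if __name__ == "__main__":
    # A 400 kbit input frame with a 200 kbit target doubles the scale.
    print(scale_quantizer(8, 400_000, 200_000))
```

A practical implementation would also map the result
onto the quantizer scale values permitted by the
bitstream syntax, which this sketch does not attempt.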

Corresponding methods and apparatuses are
presented.



BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a stat remux, and a data flow
into and out of a quantization level processor (QLP),
in accordance with the present invention.

FIG. 2 illustrates a simplified transcoder for use
in accordance with the present invention.

FIG. 3 illustrates a transcoder that performs
requantization without motion compensation for use in
accordance with the present invention.

FIG. 4 illustrates an end-to-end stat remux
processing delay in accordance with the present
invention.

FIG. 5 illustrates a transcoder video buffering
verifier (VBV) model in accordance with the present
invention.

FIG. 6 illustrates transcoder rate timing in
accordance with the present invention.

FIG. 7 illustrates communication timing between a
QLP and transcoder processing elements (TPEs) in
accordance with the present invention.



DETAILED DESCRIPTION OF THE INVENTION 



The present invention relates to a statistical 
remultiplexer for transcoding digital video signals. 
The following acronyms and terms are used: 
BW - Bandwidth
DCT - Discrete Cosine Transform
DTS - Decoding Time Stamp
ES - Elementary Stream
FIFO - First-In, First-Out
KP - Kernel Processor
MTS - MPEG Transport Stream
PCI - Peripheral Component Interconnect
PCR - Program Clock Reference
PES - Packetized Elementary Stream
PID - Packet Identifier
Q - Quantization
QLP - Quantization Level Processor
SDRAM - Synchronous Dynamic Random Access Memory
TP - Transport Packet
TPE - Transcoder core Processing Element
VLD - Variable-Length Decoding
VLE - Variable-Length Encoding

FIG. 1 illustrates a stat remux, and a data flow 
into and out of a QLP, in accordance with the present 
invention. 

The stat remux 100 includes a groomer 105 for
receiving a number of input transport streams.
Corresponding input transport packets in different
video services are provided to one of a number of
transcoders 110, 112, or TPEs (transcoding engines).
Typically, each transcoder can handle one or more video
services (channels). Transcoded data is provided via a
PCI bus 115 to a multiplexer (mux) 120, which assembles
a corresponding output transport stream.

A Kernel Processor (KP) configures the groomer 
105, the TPEs 110, 112, the QLP 130, and the Mux 
120. 

In particular, the mux 120 is responsive to a
transmission bit rate provided from a QLP 130, which
has a memory 132 such as an SDRAM. The QLP may be
implemented using a media processor, such as the
MAPCA2000 (300 MHz) media processor from Equator
Technologies, Inc. The QLP 130 performs the following
functions:

• Allocates an available bandwidth to the output
video services to optimize the video quality, and
determines the target frame size for each frame to
be transcoded.

• Receives configuration parameters from the Kernel
Processor.

• Reports operational status and statistics back to
the Kernel Processor.

The QLP communicates with the KP and TPEs via the
PCI bus (32-bit @ 66 MHz). A block of memory is
allocated on an SDRAM of the QLP for interprocessor
communication. This memory block is "shared" by the
QLP with other processors.

Inputs to QLP 130

• Configuration parameters and commands (Source: KP
140)

• Video and associated audio and data input packet
rate information (Source: TPEs 110, 112)

• Statistics of the input frame to be transcoded
(Source: TPEs)

• Timing information of the input frame (Source:
TPEs)

• Statistics of the output frame just transcoded
(Source: TPEs)

• Timing information of the output frame (Source:
TPEs)

• Transcoder FIFO level (Source: TPEs)

• Non-video (data) input packet rate information
(Source: Mux 120)

Outputs from QLP 130

• Status and statistics (Destination: KP)

• TPE service assignment information (Destination:
KP)

• Transmission bit rate (Destination: Mux)

• Transcoding target frame size (Destination: TPE)

• Maximum and minimum frame size for buffer
protection (Destination: TPE)

• Minimum number of PCRs to be inserted into a frame
(Destination: TPE)

• Flag to command the TPE to pass through a frame
(Destination: TPE)



1. Overview

Although the transcoder 100 is not necessarily
decoding and re-encoding the video stream, the
transcoding function can emulate a full decode and
re-encode. The rate control and stat-remux system in
accordance with the invention is summarized in the
following steps. Details are described in the next
sections.

1. Each TPE 110, 112 inputs the transport
stream of every video channel it is processing. The
transport stream is then unpacketized and a video
decoder buffer is emulated. A lookahead buffer is used
by each TPE to store a number of future frames and
obtain statistical information from these frames. In
particular, for every input frame to be transcoded, the
TPE computes the average quantizer scale values and
the number of bits in the input frame. These
parameters are used by the QLP 130 to calculate the bit
rate need parameter for the input frame at a scheduled
time that is at least 1.5 NTSC frame times before it is
that frame's turn to be transcoded. In particular,
because of the possible use of the 3:2 pulldown format
in the video channels, a coded frame can be either one
NTSC frame time (33.3 ms) or 1.5 frame times (50 ms).
The longer time is used to make sure the transcoding
rate allocation for the frame is determined before the
actual transcoding begins.

2. The QLP 130 performs a bandwidth allocation
process to allocate a transcoding bit rate to the TPEs
at periodic intervals, Tq. It computes the transcoding
bit rate for every video channel for each Tq interval.
The transcoding bit rate is stored in a queue at the
QLP 130 and delayed for (0.5 sec + 3 NTSC frame
periods) = 0.6 sec, rounded to the nearest Tq period.
One NTSC frame period is 1/30 sec. The delayed value
of the transcoding bit rate of the video channel
becomes the transmission bit rate, and is used by the
Mux 120.

3. While a frame is in the lookahead buffer of a
TPE, the average transcoding bit rate over the frame is
used to derive an initial value of a target frame size,
which is a predicted size of the frame after
transcoding. This initial target frame size value is
stored in an output frame size queue (e.g., in memory
132) of the QLP 130, and is retrieved when the
associated TPE is ready to transcode the frame. Queues
may be implemented by the QLP in the memory 132.

4. When the transcoder is ready to transcode a
new frame, the initial target frame size value that was
previously determined is retrieved from the output
frame size queue. Based on the current state
(fullness) of the transcoder buffer, the maximum and
minimum frame sizes that protect the decoder buffer
from underflow or overflow are calculated for bounding
the initial target frame size.

5. If the number of target bits (target frame
size) is greater than (or close to, within a
predetermined tolerance - see section 6.3) the number
of bits in the input frame, this frame bypasses
transcoding in a passthrough mode since the purpose of
transcoding is to reduce the number of bits in a frame.
This situation may occur when the input bitstream is
already heavily compressed. When a frame is bypassed,
the associated input elementary stream is
re-packetized. If the number of target bits is smaller
than the number of bits in the input frame, the frame
is bit-reduced (transcoded). Bit reduction may be
performed, e.g., either through a simplified transcoder
architecture (FIG. 2) or re-quantization (FIG. 3).

6. The quantizer scale for each macroblock (or 
group of macroblocks) for transcoding is chosen based 
on the number of target bits per frame, and the 
original quantizer scale. The condition that the 
output quantization scale is higher (coarser) than the 
input quantization scale must be met. 

7. During transcoding, the TPEs have to allocate
certain slots for a PCR field in the packets they
output. It is important to avoid allocating more slots
than necessary since this wastes bits. Thus, the
outgoing packets at a TPE are created and stored in
memory, e.g., in a TPE FIFO buffer. Moreover, the QLP
130 uses the target frame size to estimate the time
used for transmitting the frame, and hence the time for
inserting the PCRs.

Moreover, a PCR slot is created at least every 0.1
second to conform to the MPEG-2 system standard
requirement.

8. At each n*Tq period, the Mux 120 reads the 
number of packets assigned to each channel from the 
TPEs 110, 112 via the PCI bus 115. "n" is a design 
parameter for the Mux 120 and can be any positive 
integer. This packet assignment is equivalent to the 
transmission bit rate allocation, which is in turn a 
delayed version of the transcoding bit rate allocation. 
That is, the bit rate is converted to a number of 
packets to send to the Mux per Tq period. 

9. The Mux 120 receives a transport tick every m
ticks of a 27 MHz clock. If the packet to transmit
contains a PCR, the Mux performs PCR correction to
provide a PCR value that is properly synchronized with
a master clock of the transcoder. This may be achieved
as described in commonly-assigned, co-pending U.S.
Patent Application No. ______ to R. Nemiroff, V. Liu
and S. Wu, filed on ______, and entitled "Regeneration
Of Program Clock Reference Data For MPEG Transport
Streams." The transport packet(s) are sent out of the
Mux Processor via the PCI bus 115. The transport tick
refers to the timing interval for outputting a
transport packet.
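
The delay applied in step 2, by which a transcoding bit
rate later becomes the transmission bit rate, may be
illustrated by the following minimal sketch. The class
and function names, and the example Tq period of 10 ms,
are hypothetical; the text does not give a value for
Tq.

```python
from collections import deque

NTSC_FRAME_PERIOD = 1.0 / 30.0  # seconds


def delay_in_tq_periods(tq_period: float,
                        buffer_delay: float = 0.5,
                        frame_periods: float = 3.0) -> int:
    """Return the 0.6 sec delay rounded to the nearest Tq period."""
    delay = buffer_delay + frame_periods * NTSC_FRAME_PERIOD
    return round(delay / tq_period)


class RateDelayLine:
    """Fixed-length queue: the value pushed now is returned n pushes later."""

    def __init__(self, n_periods: int, initial_rate: int = 0) -> None:
        self._queue = deque([initial_rate] * n_periods)

    def push(self, transcoding_rate: int) -> int:
        # Store the new transcoding rate and release the oldest one,
        # which serves as the transmission rate for this Tq period.
        self._queue.append(transcoding_rate)
        return self._queue.popleft()


if __name__ == "__main__":
    n = delay_in_tq_periods(tq_period=0.01)  # assumed Tq of 10 ms
    line = RateDelayLine(n)
    for tick in range(n + 2):
        tx_rate = line.push(4_000_000 + tick)
    print(n, tx_rate)
```

One such delay line would be kept per video channel, so
that each channel's transmission bit rate trails its
transcoding bit rate by the same fixed number of Tq
periods.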

FIG. 2 illustrates a simplified transcoder for use 
in accordance with the present invention. 

While a straightforward transcoder can simply be a 
cascaded MPEG decoder and encoder, the transcoder 200 
provides a simplified design that reduces computations. 
The transcoder architecture 200 performs most 
operations in the DCT domain, so both the number of 
inverse-DCT and motion compensation operations are 
reduced. Moreover, since the motion vectors are not 
recalculated, the required computations are 
dramatically reduced. This simplified architecture 
offers a good combination of both low computation 
complexity and high flexibility. 

In particular, a pre-compressed video bitstream is
input to a Variable Length Decoder (VLD) 215. A
dequantizer function (inverse quantizer) 220 processes
the output of the VLD 215 using a first quantization
step size, Q1.

Motion vector (MV) data is provided from the VLD
215 to a motion compensation (MC) function 235, which
is responsive to a previous frame buffer 250 and/or a
current frame buffer 245 of pixel domain data. A DCT
function 270 converts the output of the MC function 235
to the frequency domain and provides the result to an
adder 230. A switch 231 passes either the output of
the adder 230 or the Q1^-1 function 220 to a
quantization function Q2 275, which quantizes the data,
typically at a coarser level to reduce the bit rate.
This output is then inverse quantized at an inverse
quantization function Q2^-1 282 for summing at an adder
286 with the output of the switch 231. The output of
the adder 286 is provided to an IDCT function 284, and
the output thereof is provided to the frame buffers 245
and 250.

A Variable Length Encoder (VLE) 280 codes the
output of the quantization function 275 to provide an
output bitstream at a reduced bit rate. The output bit
rate of the transcoder is thus adjusted by changing Q2.

FIG. 3 illustrates a transcoder 300 that performs
requantization without motion compensation for use in
accordance with the present invention.

Here, only re-quantization is applied to a frame,
without motion compensation. Generally, IDCT and DCT
operations are avoided. This strategy incurs a lower
complexity, but causes some artifacts in the output
data. The DCT coefficients are de-quantized then
re-quantized.

In particular, a VLD 410, inverse quantizer 420,
quantizer 430 and VLE 440 are used.
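
The de-quantize then re-quantize operation may be
illustrated by the following minimal sketch. It assumes
a plain uniform quantizer with a single step size per
block; the weighting matrices and rounding rules of the
actual MPEG-2 quantizer are not modeled, and the
function name is hypothetical.

```python
def requantize(levels: list[int], q1: int, q2: int) -> list[int]:
    """Map levels quantized with step q1 to levels with coarser step q2."""
    if q1 <= 0 or q2 < q1:
        raise ValueError("expected 0 < q1 <= q2 (coarser output step)")
    out = []
    for level in levels:
        coeff = level * q1  # de-quantize with the original step size
        # Re-quantize: round the magnitude to the nearest multiple of q2,
        # using integer arithmetic, then restore the sign.
        magnitude = (2 * abs(coeff) + q2) // (2 * q2)
        out.append(magnitude if coeff >= 0 else -magnitude)
    return out


if __name__ == "__main__":
    print(requantize([10, -3, 1, 0], q1=4, q2=8))
```

Because the reference frames are not updated, the
requantization error of one frame propagates to frames
predicted from it, which is consistent with the
artifacts noted above.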



2. End-to-end Processing Delay 

FIG. 4 illustrates an end-to-end stat remux
processing delay in accordance with the present
invention.



An example one of the transcoders or TPEs 110
includes an MTS buffer 405 for buffering the input
transport stream, a demux 410 for separating out the
elementary streams of the different services in the
transport stream, and an ES buffer 415 for storing the
elementary streams. The ES data is variable-length
decoded at a VLD function 420, and the result is
provided to a lookahead delay buffer 425, with a
capacity of, e.g., five frames. After a one-frame
delay at a buffer 435, a frame is transcoded at a
transcode function 440, and the result is stored in a
transcode buffer 445. A remultiplexer (remux) 450
combines data from the transcode buffer 445 and data,
if present, from a transport stream delay buffer 430,
and the resulting transport stream is communicated to a
decoder 452, such as a set-top box in a broadband
communication network. The transport stream delay
buffer 430 is used for the bypass frames, discussed
previously, that are not transcoded. The bypass frames
are delayed to maintain synchronicity with the other
channels that are transcoded.

Note that, in practice, the output stream from the
transcoder 110 is combined with other transport streams
from the other transcoders to form a transport
multiplex that is communicated to a representative
decoder 452. The decoder 452 includes a FIFO buffer
455 that buffers the incoming data, and a decoding
function 460 that decodes the data to provide an
output, e.g., for display on a television.

A buffer delay of, e.g., 0.5 seconds, which can
vary in different implementations, is experienced by
the video packets. This is a delay between the
transcoding (encoding) time and the decode time. This
delay occurs in both the transcoder (output) video FIFO
445 and the decoder FIFO 455. The buffer delay is
fixed. If the transcoding time is delayed, the actual
transcode time to decode time is shortened, but the
transcode 'tick' to decode time is fixed by the buffer
delay.



3. Buffer Model 

FIG. 5 illustrates a transcoder VBV model in
accordance with the present invention.

The VBV model of the decoder 452 is used to limit
the maximum and minimum frame size before transcoding a
new frame. The level of the transcoder's bitstream
output FIFO can be used to derive the decode buffer
status just before the DTS of the new frame (see also
FIG. 7). Specifically, the future decode buffer status
(vbv_fullness) is given by

vbv_fullness = (No. of bits to be transmitted
from current time to the DTS time of the new frame) -
(No. of bits in the encoder FIFO).

The vbv_fullness calculation is shown in FIG. 5, where
the composition of the transcoder FIFO is shown at 500,
and the corresponding composition of the decoder FIFO
is shown at 550.

Moreover, we can compute the number of bits
transmitted by adding all the encoding rates starting
from t seconds up to the last encoding rate issued by
the QLP (where t is the total delay through the encode
plus decode buffers, i.e., the system delay). The QLP
provides the bit rate as a number of packets to output
for the time period Tq.

Moreover, a margin needs to be added due to the
uncertainty caused by the variable latency from the
time the QLP issues a rate change to the time the new
rate actually takes effect at the transcoder. The bit
rate is changed at the fixed period Tq. Tq is
asynchronous to the video frame time (DTS of decoder).

As shown in FIG. 6, the transcoding rates are
computed at the dashed lines, e.g., 602, 604, 606, ...
This example assumes the system delay is three frames
and the transmission rate needs to be computed for P1-1
to P1-3. With this notation, P1-1 denotes
Program (bitstream #1) frame #1, P1-2 denotes
Program (bitstream #1) frame #2, and so forth. Since
the frame DTS times do not align with the rate changes,
this causes a difference between the transcoding rate
and the transmission rate. Moreover, when a Tq
period straddles two frames, the second (later) frame
is assigned those packets.

The worst case rate error between transcoding and
transmission is the difference in the number of packets
allocated at the current time and the number of packets
assigned a system delay time later (DTS of the current
frame). The bottom of FIG. 6 shows the two extreme
cases. In the first case (650), the frame DTS occurs
just prior to Tq. In the second case (670), the frame
DTS occurs at Tq. A through AA represent the number of
packets assigned to each Tq period. Both cases have the
same encoding packet assignment, Sum(B through X).
The number of transmission packets for case 1 is
Sum(C, D, E, ..., W, X, Y); and, for case 2,
Sum(B, C, D, ..., V, W, X). Case 2 has no difference
between encoding and transmission rates, so this is the
best case (DTS aligned with Tq). Case 1, which is the
worst case, has a difference of B packets.

Therefore, the estimated number of bits to be
transmitted from the current time to the DTS time is:

$$\sum_{n} \text{transcoding\_packets}(n) \pm \text{packet\_count\_error},$$

where packet_count_error = (number of packets assigned
at DTS time) - (number of packets assigned at current
time). Positive and negative values of
packet_count_error have different effects on frame size
calculations.

Note that the system delay should be a multiple of Tq.
With the estimated VBV fullness given by

vbv_fullness = (no. of bits to be transmitted) -
(bits in transcoder FIFO);

this value can be used to limit the transcoded
frame size, so it will be no more than vbv_fullness.
This requirement is imposed to ensure the decoder
buffer will not underflow while decoding the current
frame (i.e., the frame that is about to be transcoded).
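
The vbv_fullness estimate may be illustrated by the
following minimal sketch. It assumes that the per-Tq
packet allocations up to the DTS of the new frame are
available as a list, and that the margin is applied as
a signed packet_count_error added to the packet sum;
the text does not specify how the margin is applied, so
that choice is an assumption.

```python
TS_PACKET_BITS = 188 * 8  # bits in one MPEG transport packet


def vbv_fullness(packets_per_tq: list[int],
                 fifo_bits: int,
                 packet_count_error: int = 0) -> int:
    """Estimate the decoder buffer fullness just before the new frame's DTS.

    packets_per_tq: packets allocated in each Tq period from the current
        time to the DTS of the new frame (hypothetical input format).
    fifo_bits: bits currently held in the transcoder output FIFO.
    packet_count_error: signed margin, in packets (assumed convention).
    """
    packets = sum(packets_per_tq) + packet_count_error
    bits_to_transmit = packets * TS_PACKET_BITS
    return bits_to_transmit - fifo_bits


if __name__ == "__main__":
    # 30 packets scheduled, 20,000 bits already waiting in the FIFO.
    print(vbv_fullness([10, 10, 10], fifo_bits=20_000))
```

The returned value serves as the upper limit on the
size of the frame about to be transcoded.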
The maximum frame size and minimum frame size can
be derived from the sequence of transmission bit rates,
a snapshot of the transcoder buffer level, and the sizes
of the previously transcoded frames, as follows. Let:

B(t) = buffer level at time t.
tc = time when the current frame enters the
transcoder FIFO.
t0 = time when the transcoder FIFO level was
last read.
T(t) = size of the frame entering the FIFO at
time t.
R(t) = transmission bit rate at time t.
dts = DTS of the current frame.
nextDts = DTS of the next frame.
D = decoder buffer size.

Since B(tc) equals B(t0) plus the frames added to the
FIFO, less the bits transmitted, between t0 and tc:

$$\text{Maximum Frame Size} = \sum_{t=t_c}^{dts} R(t) - B(t_c)$$

$$= \sum_{t=t_c}^{dts} R(t) - \left[ B(t_0) + \sum_{t=t_0}^{t_c} T(t) - \sum_{t=t_0}^{t_c} R(t) \right]$$

$$= \sum_{t=t_0}^{dts} R(t) - B(t_0) - \sum_{t=t_0}^{t_c} T(t)$$

$$\text{Minimum Frame Size} = \sum_{t=t_c}^{nextDts} R(t) - B(t_c) - D$$

$$= \text{Maximum Frame Size} + \sum_{t=dts}^{nextDts} R(t) - D$$
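
The frame size bounds may be illustrated by the
following minimal sketch, in which the maximum is the
number of bits transmitted from tc to dts less the
buffer level B(tc), and the minimum adds the bits
transmitted from dts to nextDts and subtracts the
decoder buffer size D. The function names and the
pre-summed inputs are illustrative assumptions.

```python
def frame_size_bounds(bits_tc_to_dts: int,
                      bits_dts_to_next_dts: int,
                      buffer_level_tc: int,
                      decoder_buffer_size: int) -> tuple[int, int]:
    """Return (minimum, maximum) frame size in bits.

    bits_tc_to_dts: sum of R(t) from tc to dts.
    bits_dts_to_next_dts: sum of R(t) from dts to nextDts.
    buffer_level_tc: B(tc), the transcoder FIFO level at tc.
    decoder_buffer_size: D.
    """
    max_size = bits_tc_to_dts - buffer_level_tc
    min_size = max_size + bits_dts_to_next_dts - decoder_buffer_size
    return min_size, max_size


def bound_target(target: int, min_size: int, max_size: int) -> int:
    """Clamp an initial target frame size to the computed bounds."""
    return max(min_size, min(target, max_size))


if __name__ == "__main__":
    lo, hi = frame_size_bounds(600_000, 100_000, 200_000, 450_000)
    print(lo, hi, bound_target(500_000, lo, hi))
```

A negative minimum simply means that the decoder buffer
cannot overflow for any frame size, so only the maximum
constrains the target.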

FIG. 7 illustrates communication timing between a
quantization level processor (QLP) and transcoder
processing elements (TPEs) in accordance with the
present invention.

At time 705, a TPE sends statistical information
for a current frame "N" to the QLP. At time 710, the
TPE sends information regarding the fullness of the
TPE's output buffer, which includes data from a
previously transcoded frame with an index of N-k, e.g.,
where k=1. The previously coded frame is usually frame
N-1, i.e., the previous frame. However, sometimes it
might take more than one frame time to transcode a
frame so the timing might "slip". In that case, the
distance between the "current frame" and the frame just
transcoded may be more than one frame, e.g., such that
k=2. At time 715, transcoding starts for frame "N"
using the need parameter calculated from the associated
statistical information. The transcode bit rate is
calculated for each Tq period, such as at example time
720.

Time 725 denotes the start of transcoding for the
next frame, with index N+1.

At an example time 730, the TPE sends information
regarding the fullness of its output buffer, which now
contains data from frame N, to the QLP. In response,
the QLP provides a target frame size, and minimum and
maximum bounds for the transcoding bit rate, to the TPE
at time 735.

Times 740 and 745 denote the times of the decode
time stamps of frames N and N+1, respectively.

At time 750, the QLP delivers a transmission bit
rate to the mux to inform the mux how many packets of
data in the TPE's output buffer to output in a
transport stream. This time 750 follows the transcode
time 720 by a delay period.

4. Need Parameter

A bit rate need parameter is determined for each
frame based on an expected complexity of the frame. A
transcoding bit rate is allocated to each TPE by the
QLP 130 based on the need parameters and the available
bandwidth.

Referring again to FIG. 4, the bits of an input
frame are first partially decoded by the variable
length decoder 420, and average quantizer scales and
the number of bits in the frame are computed. A number
of frames, e.g., five frames, of partially decoded
coefficients and headers are stored for each video
channel in the lookahead buffer 425, which provides a
corresponding lookahead delay. The size of the
processor SDRAM memory 132 limits the length of the
lookahead buffer. Each coefficient takes two bytes,
resulting in 720x480x1.5x2 = approximately
1 Mbyte/frame.
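
The lookahead buffer memory estimate may be reproduced
with the following minimal sketch. It interprets the
factor of 1.5 as one luma sample plus two quarter-size
chroma samples per pixel, which is an assumption about
the intent of the figure; the function name is
hypothetical.

```python
def lookahead_bytes(width: int = 720,
                    height: int = 480,
                    frames: int = 5,
                    bytes_per_coeff: int = 2) -> int:
    """Bytes needed to hold `frames` frames of partially decoded coefficients."""
    # 1.5 coefficients per pixel (assumed: luma plus subsampled chroma).
    coeffs_per_frame = width * height * 3 // 2
    return coeffs_per_frame * bytes_per_coeff * frames


if __name__ == "__main__":
    print(lookahead_bytes(frames=1))  # one frame
    print(lookahead_bytes())          # five-frame lookahead, one channel
```

The per-channel total scales linearly with the number of
lookahead frames, which is why the SDRAM size bounds the
lookahead length.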
At a specific time, Tframestart, determined by the
intended decode time of the frame at the target decoder
452, the need parameter is computed for the oldest
frame in the lookahead buffer 425. The decode time is
specified by the DTS of the frame, which is in units of
27 MHz clock ticks. Tframestart is defined as:

(Decode time of the frame - buffer delay - 1.5
NTSC frame time).
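
The Tframestart calculation may be illustrated by the
following minimal sketch, working in 27 MHz clock ticks
and using the example buffer delay of 0.5 seconds.
Clock wraparound is not handled, and the function name
is hypothetical.

```python
CLOCK_HZ = 27_000_000            # 27 MHz clock
TICKS_PER_NTSC_FRAME = 900_000   # 27 MHz ticks in 1/30 sec


def frame_start_ticks(dts_ticks: int, buffer_delay_sec: float = 0.5) -> int:
    """Return Tframestart = DTS - buffer delay - 1.5 NTSC frame times."""
    buffer_delay_ticks = round(buffer_delay_sec * CLOCK_HZ)
    lead_ticks = 3 * TICKS_PER_NTSC_FRAME // 2  # 1.5 frame times
    return dts_ticks - buffer_delay_ticks - lead_ticks


if __name__ == "__main__":
    # A frame due for decoding at the 1.0 second mark.
    print(frame_start_ticks(27_000_000))
```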

The need parameter is computed from the average
quantizer scale and the bit count of the input frames,
as follows:

NeedParameter = MbResolutionAdjust * AvgQR *
(CurrentQR + Alpha * PastQR) / (Beta * CurrentQR +
PastQR), where

AvgQR = (sum of (avgInQuant * inFrameSize) over
the most recent 15 P or B frames and the most recent I
frame in the past) * 900,000 / (DTS of current frame -
DTS of the 16th frame in the past). 900,000 is the
number of 27 MHz units in one frame period (1/30 sec.).
27 MHz is the MPEG clock rate.

If there is no I frame within the past, e.g., 45
frames, the past 16 P or B frames are used.

For an I frame,

CurrentQR = avgInQuant * inFrameSize of the
current frame.

PastQR = avgInQuant * inFrameSize of the last I
frame. If there is no I frame within the past 45
frames, PastQR is set to the same value as
CurrentQR.

For a P or B frame,

CurrentQR = average of (avgInQuant * inFrameSize)
over the current frame and every frame in the lookahead
buffer 425 of the same picture type.

PastQR = average of (avgInQuant * inFrameSize)
over the past four frames of the same picture type. If
there are fewer than four frames of the same picture
type in the past, PastQR is set to the same value as
CurrentQR.

Alpha and Beta are adjustable parameters that
control the reaction of the need parameter to a
change in the product of quantizer scale and bit count.
Default values are Alpha = 256 and Beta = 256.

MbResolutionAdjust is an adjustable parameter that
compensates for the perceptual difference in distortion
at different resolutions. The lower the resolution, the
more visible the distortion. Therefore the need
parameter is boosted for lower resolutions. Default
values of MbResolutionAdjust are 1.0 for full
resolution, 1.2 for three-quarter resolution, and 1.5
for half resolution. Alternatively, or in addition,
the need parameter may be adjusted based on a
macroblock resolution, which is the number of
macroblocks in a frame.
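
The need parameter calculation may be illustrated by
the following minimal sketch. The function names and
the form of the inputs (a list of avgInQuant *
inFrameSize products and DTS values in 27 MHz ticks)
are illustrative assumptions; the selection of which
frames contribute to AvgQR, CurrentQR and PastQR is
left to the caller.

```python
TICKS_PER_NTSC_FRAME = 900_000  # 27 MHz ticks in 1/30 sec


def avg_qr(products: list[float], dts_current: int, dts_16th_past: int) -> float:
    """AvgQR: summed (avgInQuant * inFrameSize), normalized per frame period."""
    return sum(products) * TICKS_PER_NTSC_FRAME / (dts_current - dts_16th_past)


def need_parameter(avg_qr_value: float,
                   current_qr: float,
                   past_qr: float,
                   alpha: float = 256.0,
                   beta: float = 256.0,
                   mb_resolution_adjust: float = 1.0) -> float:
    """NeedParameter as defined above, with the stated default Alpha and Beta."""
    numerator = current_qr + alpha * past_qr
    denominator = beta * current_qr + past_qr
    return mb_resolution_adjust * avg_qr_value * numerator / denominator


if __name__ == "__main__":
    # Sixteen frames spaced one NTSC frame period apart.
    a = avg_qr([1000.0] * 16, dts_current=16 * 900_000, dts_16th_past=0)
    print(a, need_parameter(a, current_qr=50.0, past_qr=50.0))
    # Half-resolution channel: the need parameter is boosted by 1.5.
    print(need_parameter(a, 50.0, 50.0, mb_resolution_adjust=1.5))
```

When CurrentQR equals PastQR and Alpha equals Beta, the
ratio term is one and the need parameter reduces to
MbResolutionAdjust * AvgQR.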

5. Input Bit Rate Information

In every Tq time slot, the TPEs 110, 112 and
Mux 120 count the number of input transport packets and
save this packet count information in circular buffers
on the QLP 130. There is one circular buffer of input
bit rate information for each video program/channel
processed by the TPEs, and one circular buffer of bit
rate information for all the data streams that are
passed directly to the Mux 120 without going through a
TPE (i.e., in the passthrough mode). Each circular
buffer has, e.g., 1024 entries, and each entry stores
the bit rate information of one Tq time slot. The
number of entries is a design parameter that can vary
for different implementations. The circular buffer
should be large enough to hold the data for the 0.6 sec
delay. From the packet counts, the QLP 130 can
calculate the instantaneous input bit rate for each Tq
time slot as follows:

BitRate (bits per second) = PktCount * 188 * 8 /
TqPeriod.
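
The packet count buffering and bit rate calculation may
be illustrated by the following minimal sketch. The
class name, the use of the Tq index modulo the buffer
length as the slot position, and the example Tq period
are illustrative assumptions.

```python
TS_PACKET_BYTES = 188


class PacketCountBuffer:
    """Circular buffer with one packet count entry per Tq time slot."""

    def __init__(self, entries: int = 1024) -> None:
        self._counts = [0] * entries

    def write(self, tq_index: int, pkt_count: int) -> None:
        # The Tq index selects the slot; old entries are overwritten.
        self._counts[tq_index % len(self._counts)] = pkt_count

    def read(self, tq_index: int) -> int:
        return self._counts[tq_index % len(self._counts)]


def bit_rate(pkt_count: int, tq_period_sec: float) -> float:
    """BitRate (bits per second) = PktCount * 188 * 8 / TqPeriod."""
    return pkt_count * TS_PACKET_BYTES * 8 / tq_period_sec


if __name__ == "__main__":
    buf = PacketCountBuffer()
    buf.write(tq_index=1030, pkt_count=25)
    # Assumed Tq period of 10 ms: 25 packets per slot is about 3.76 Mbit/s.
    print(bit_rate(buf.read(1030), tq_period_sec=0.01))
```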

The Mux 120 counts the number of transport packets
(except null packets) in each data service, which may
comprise one or more MPEG programs. The QLP 130 uses
the packet count to compute the instantaneous data
service input bit rate for each Tq time slot. The Mux
saves the packet count information in circular buffers
on the QLP in the same way as the packet count
information from the TPEs is saved.

The processes in which the Mux 120 and the TPEs
write packet count information into the QLP's circular
buffers are asynchronous with the Tq ticks. A Tq index,
which is saved with the packet count information, is
used to synchronize the QLP with the input packet count
information during the initialization process.

The Tq index is maintained by the QLP. The QLP
sets the Tq index to 0 at initialization, and increases
it by 1 on every Tq interrupt. The QLP periodically
broadcasts the Tq index and the associated time to the
TPEs 110, 112 and the Mux 120.

During the transcoding bit rate allocation
process, the QLP 130 sets aside the bandwidth for the
pure passthrough video channel(s) and the non-video
channels. Since the transmission bit rate of the
packets in these passthrough channels has to match the
bit rate of the corresponding packets at the input, the
bit rate to set aside for each passthrough video
channel equals the instantaneous input bit rate at time

PacketCountDelay = PassThroughDelay - TcrToTxrDelay

prior to the current time, where PassThroughDelay
is the delay of the packets in the video passthrough
channels (from demux 410 to remux 450), which is fixed
at (0.5 sec. + 6 NTSC frame periods) = 0.7 sec. in the
example implementation. The non-video PIDs have the
same amount of delay.

TcrToTxrDelay is the delay from the calculation of
the transcoding rate (current Tq tick) to the
implementation of the transmission bit rate (FIG. 7).
This delay is fixed at (0.5 sec. + 1.5 NTSC frame periods
+ 1.5 NTSC frame periods) = 0.6 sec.

Therefore, PacketCountDelay is a constant equal to
0.7 sec. - 0.6 sec. = 0.1 sec. The number of Tq ticks
equivalent to this delay is: PacketCountDelayIndex =
(PacketCountDelay / TqPeriod).

The QLP 130 synchronizes the input packet count
information with the current Tq interrupt as follows.

For each circular buffer, the QLP maintains a 10
bit read pointer. Initially, the QLP searches for the
entry in the circular buffer whose TqIndex matches the
value of (CurrentTqIndex - PacketCountDelayIndex). For
every Tq tick after that, the QLP increases the value
of the read pointer by one. The QLP also checks the
continuity of the TqIndex stored with the packet count
in the circular buffer. If there is a discontinuity,
the QLP sets a warning flag for the KP 140, and
re-initializes the read pointer by searching for the
TqIndex that matches (CurrentTqIndex -
PacketCountDelayIndex).
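The read pointer handling described above might be sketched as follows. This is an illustration only: the buffer is modeled as a list of (Tq index, packet count) pairs, the function names are hypothetical, and wrap-around of the Tq index itself is not handled.

```python
def find_read_pointer(entries, current_tq_index, delay_index):
    """Return the position whose stored Tq index equals
    current_tq_index - delay_index, or None if no entry matches."""
    wanted = current_tq_index - delay_index
    for pos, (tq_index, _pkt_count) in enumerate(entries):
        if tq_index == wanted:
            return pos
    return None


def advance_read_pointer(entries, read_ptr, current_tq_index, delay_index):
    """Advance by one entry on a Tq tick. If the stored Tq indices are not
    consecutive, raise a warning flag and search again.
    Returns (new_read_ptr, warning_flag)."""
    nxt = (read_ptr + 1) % len(entries)  # circular buffer wrap
    if entries[nxt][0] == entries[read_ptr][0] + 1:
        return nxt, False
    return find_read_pointer(entries, current_tq_index, delay_index), True
```

With the example size of 1024 entries, a 10 bit pointer wraps naturally; the modulo operation plays that role here.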

For every input frame, the QLP calculates the
average input bit rate over a frame. This computation
is performed at the same time as the frame's need
parameter calculation. The average input bit rate is
used for the calculation of the target frame size.

First, the QLP computes the number of integer Tq
periods straddled by the frame:

FrameTqCount = (difference between the decode time
of the next frame and decode time of current frame) /
TqPeriod, rounding to the next higher integer.

The QLP computes the duration of the frame from
FrameTqCount:

FrameDuration = FrameTqCount * TqPeriod.

Then, the QLP computes the average input bit rate:

AvgInBitrate = InPacketCount * 188 * 8 /
FrameDuration,

where InPacketCount is the sum of PacketCount over
FrameTqCount entries of the video packet count circular
buffer, starting from the current read pointer.
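The three steps above can be combined in a short sketch. The function name, the argument layout, and the example values are assumptions made for illustration; the circular buffer is modeled as a plain list of packet counts.

```python
import math


def average_input_bit_rate(pkt_counts, read_ptr, decode_time_s,
                           next_decode_time_s, tq_period_s):
    """Average input bit rate (bits per second) over one frame."""
    # Number of Tq periods straddled by the frame, rounded up.
    frame_tq_count = math.ceil(
        (next_decode_time_s - decode_time_s) / tq_period_s)
    frame_duration = frame_tq_count * tq_period_s
    # Sum FrameTqCount entries starting at the read pointer, with wrap.
    n = len(pkt_counts)
    in_packet_count = sum(pkt_counts[(read_ptr + k) % n]
                          for k in range(frame_tq_count))
    return in_packet_count * 188 * 8 / frame_duration
```

For instance, a frame lasting 33 ms with a hypothetical 10 ms Tq period straddles 4 slots, so the packet counts of 4 entries are averaged over 40 ms.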

6. Bandwidth Allocation 

At every Tq slot, the QLP performs the bandwidth 
allocation procedure. The QLP first assigns the 
bandwidth to the pure passthrough video programs, and 
to the data and audio programs, which are not 
transcoded. The remaining bandwidth is then allocated 
to the remaining channels based on the values of their 
need parameters, and subject to the maximum and minimum 
bit rate constraints. 

6.1. Passthrough video and data channels 

The QLP 130 assigns the transcoding bit rate to
the pure passthrough channels as follows:

if ( purePassThrough )
    TcodeBitrate = VideoInBitrate

where VideoInBitrate is the instantaneous input video
bit rate computed as:

VideoInBitrate = (PacketCount value stored in the
corresponding video program circular buffer entry at
the current read pointer) * 188 * 8 / TqPeriod.

For each statmux group, the QLP calculates the
amount of bandwidth that is available for dynamic
allocation, that is, the amount of bandwidth available
after deducting the bandwidth of the pure passthrough
channels and the PES alignment overhead bits. A stat
remux group refers to a group of channels at the
transcoder 100 that are competing for bandwidth with
one another. One or more stat remux groups may be used
at the transcoder 100.

AvailableVideoBitrate = TotalOutputBandwidth -
(sum of NonVideoInBitrate over all channels) - (sum of
TcodeBitrate over all pure passthrough channels) -
(Number of channels that are not pure passthrough *
PesOverheadBitrate).

In this equation, TotalOutputBandwidth is the
total output transport (payload) bandwidth available for
video, audio, and data services in the input streams,
including system information. This is a
user-configured parameter for the statmux group.

PesOverheadBitrate is the average overhead bit
rate for PES alignment, which is a constant:

PesOverheadBitrate = 0.5 * 184 * 8 * 30 = 22.08 Kbps.

The instantaneous non-video bit rate
(NonVideoInBitrate) is computed in a similar way to the
VideoInBitrate:

NonVideoInBitrate = (PacketCount value stored in
the corresponding non-video PID's circular buffer entry
at the current read pointer) * 188 * 8 / TqPeriod.
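As an illustration of the AvailableVideoBitrate equation, consider the following sketch. The function and argument names are hypothetical, and the factor 0.5 in the PES overhead constant is an assumption inferred from the stated result of 22.08 Kbps.

```python
# Average PES alignment overhead per transcoded channel, in bits per second.
# Assumption: the factor 0.5 is inferred from the stated 22.08 Kbps figure.
PES_OVERHEAD_BITRATE = 0.5 * 184 * 8 * 30


def available_video_bitrate(total_output_bandwidth, non_video_in_bitrates,
                            passthrough_tcode_bitrates,
                            num_transcoded_channels):
    """Bandwidth left for dynamic allocation in one statmux group."""
    return (total_output_bandwidth
            - sum(non_video_in_bitrates)
            - sum(passthrough_tcode_bitrates)
            - num_transcoded_channels * PES_OVERHEAD_BITRATE)
```

With a hypothetical 27 Mbps output, 0.8 Mbps of non-video data, one 4 Mbps pure passthrough channel, and five transcoded channels, about 22.09 Mbps would remain for dynamic allocation.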
6.2. Transcoding bit rate allocation 

For each statmux group, the QLP allocates the
AvailableVideoBitrate among the non-passthrough video
channels subject to the following constraints:

1. The sum of transcoding bit rates = GroupBandwidth.
Since the bandwidth available for dynamic allocation
is variable, and subject to the bandwidth occupied by
the passthrough components (e.g., non-video data) in
the transport stream, the group bandwidth is
expressed as a percentage of the total available
bandwidth when there is more than one statmux group
configured for the output transport multiplex.

2. The sum of the average transcoding bit rate for all
non-pure-passthrough video channels on any single TPE
has to be less than an upper bound that is determined
by the Variable Length Encoder's (380, 440) maximum
throughput on the TPE.

3. For a pure passthrough channel, the output bit rate
should be equal to the input bit rate. A channel may
be processed as a pure passthrough channel, e.g., to
preserve its quality.



4. For any video channel, the output target frame size 
cannot be bigger than the input frame size. This 
translates to the constraint that the average 
transcoding bit rate cannot exceed the average input 
bit rate. 

5. The target frame size cannot be higher than a maximum 
value, nor lower than a minimum value, which are 
provisioned to protect the video buffers. 

The procedure of transcoding bit rate allocation 
is outlined as follows. 

6.2.1. Compute an approximation of the maximum frame 
size 

The maximum transcoded frame size to protect the
decoder buffer from underflow is given by:

MaxFrameSize = (number of bits transmitted to the
decoder 452 from the time the first bit of the
transcoded frame enters the transcoder FIFO 445 to the
decode time of the frame) - (transcoder FIFO level at
the time the first bit of the transcoded frame enters
the FIFO).

However, the transcoder FIFO level at the time the
first bit of the transcoded frame enters the FIFO is
not known at the time the transcoding bit rate is
calculated. Therefore, an approximation of the maximum
transcoded frame size is calculated as follows:

MaxFrameSizeEstimate = delayBitsMax - FifoLevel -
offsetBitsMax.

The value of delayBitsMax is the number of bits
transmitted to the decoder 452 from the last time the
FIFO level was read to the decode time of the frame,
and is calculated by:

delayBitsMax = TqPeriod * (sum of transmission bit
rate values in the transmission bit rate queue for
Ndelay terms starting from FrameMarker), where:

Ndelay = Number of Tq slots counting from the time
when the FifoLevel is read to the time when the frame
is decoded.

The value of FifoLevel is the most recent output
FIFO level of the transcoder.

The value of offsetBitsMax is the approximate
number of bits entering the transcoder FIFO from the
time the FIFO level was last read to the time the first
bit of the target transcoded frame enters the FIFO.
This approximation is given by the sum of the initial
(unbounded) target frame sizes of the frames waiting to
be transcoded. This is equal to:

offsetBitsMax = Size of the most recent output
frame + target frame size of the frame currently being
transcoded + sum of target frame sizes of the frames
preceding the current frame that are waiting to be
transcoded.

In the approximation of the maximum frame size, it
is assumed that the number of bits generated by the
future transcoded frames meets the frame target, and
the initial frame target values in the QLP's output
queue 132 do not hit the maximum frame size nor the
minimum frame size.
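The estimate of the maximum frame size can be sketched as below. The function signature is hypothetical; in particular, pending_frame_bits is an assumed list holding the size of the most recent output frame, the target of the frame currently being transcoded, and the targets of the waiting frames that precede the current frame.

```python
def max_frame_size_estimate(tx_bitrate_queue, frame_marker, n_delay,
                            tq_period_s, fifo_level_bits, pending_frame_bits):
    """Approximate largest transcoded frame (in bits) that still reaches
    the decoder by its decode time."""
    # Bits sent to the decoder over the next n_delay Tq slots.
    window = tx_bitrate_queue[frame_marker:frame_marker + n_delay]
    delay_bits_max = tq_period_s * sum(window)
    # Bits expected to enter the FIFO ahead of this frame.
    offset_bits_max = sum(pending_frame_bits)
    return delay_bits_max - fifo_level_bits - offset_bits_max
```

The estimate shrinks as the FIFO fills or as more frames are queued ahead of the current one, which matches the intent of protecting the decoder buffer from underflow.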

6.2.2. Compute an estimate of the minimum frame size 

The minimum transcoded frame size to protect the
decoder 452 from overflow is given by:

MinFrameSize = (number of bits transmitted to the
decoder from the time the first bit of the transcoded
frame enters the transcoder FIFO to the decode time of
the next frame) - (Size of the decoder's buffer) -
(transcoder FIFO level at the time the first bit of the
transcoded frame enters the FIFO).

It can be shown that MinFrameSize is related to
MaxFrameSize by:

MinFrameSize = MaxFrameSize + (Number of bits
transmitted to the decoder from the decode time of the
current frame to the decode time of the next frame) -
(Size of decoder's buffer).

Therefore,

MinFrameSizeEstimate = MaxFrameSizeEstimate +
DeltaBitsMin - DecoderBufferSize,

where DeltaBitsMin = Number of bits transmitted to
the decoder from the decode time of the current frame
to the decode time of the next frame, which can be
calculated by summing the corresponding terms in the
queue of the transmission bit rate.

In the example implementation, DecoderBufferSize
is the MPEG2 Main Profile, Main Level
buffer size, which is 1.835 Mbits.

6.2.3. Compute the maximum transcoding bit rate that
protects the buffer

A maximum transcoding bit rate must be set to
avoid a decoder buffer underflow. The target frame size
of a frame is computed as the input frame size scaled
by the ratio of the average transcoding bit rate to the
average input bit rate. Therefore, the maximum
transcoding bit rate is calculated as follows from the
maximum frame size, assuming the transcoding bit rate
remains constant until the end of the frame time:

MaxTcodeBitrate = ( ( MaxFrameSize / OrigFrameSize
) * AvgInBitrate * FrameTqCount - (Sum of transcoding
bit rate from the beginning of the frame to the current
Tq interrupt) * FrameTqIndex ) / (FrameTqCount -
FrameTqIndex),

where OrigFrameSize is the number of bits in the
input frame, FrameTqCount is the number of Tq time
slots in the frame time, and FrameTqIndex is the number
of Tq time slots since the start of the frame
(TframeStart).

6.2.4. Compute the minimum transcoding bit rate that
protects the buffer

A minimum transcoding bit rate must be set to
avoid a decoder buffer overflow. For each video
service, the minimum transcoding bit rate is computed
in a manner that is similar to the maximum transcoding
bit rate:

MinTcodeBitrate = ( ( MinFrameSize / OrigFrameSize
) * AvgInBitrate * FrameTqCount - (Sum of transcoding
bit rate from the beginning of the frame to the current
Tq interrupt) * FrameTqIndex ) / (FrameTqCount -
FrameTqIndex).
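The two bounds share one form, so a single helper can illustrate both. This is a sketch under a stated assumption: the subtracted term is taken to be the total of the per-slot transcoding bit rates already assigned in this frame (equivalently, their average multiplied by FrameTqIndex). The function and argument names are hypothetical.

```python
def bounding_tcode_bitrate(bound_frame_size, orig_frame_size, avg_in_bitrate,
                           frame_tq_count, frame_tq_index,
                           tcode_bitrate_sum_so_far):
    """Constant rate for the remaining Tq slots that makes the frame's
    average transcoding bit rate hit the bound. Pass MaxFrameSize to get
    the maximum rate, or MinFrameSize to get the minimum rate."""
    remaining_slots = frame_tq_count - frame_tq_index
    if remaining_slots <= 0:
        raise ValueError("no Tq slots remain in this frame")
    # Average rate at which the target frame size equals the bound.
    target_avg = bound_frame_size / orig_frame_size * avg_in_bitrate
    # Assumption: tcode_bitrate_sum_so_far is the plain sum of the rates
    # assigned in the first frame_tq_index slots.
    return ((target_avg * frame_tq_count - tcode_bitrate_sum_so_far)
            / remaining_slots)
```

At the first slot of a frame (index 0, nothing assigned yet) the result reduces to the bound frame size divided by the original frame size, times the average input bit rate.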

6.2.5. Calculate the maximum aggregated bit rate that 
can be processed by each TPE 

The average output bit rate among all video
services on any single TPE over a window (e.g., 3 frame
periods) is constrained by the processing power of the
VLE in the TPE, e.g., the throughput is constrained to
no more than an average of 12 Mbits/sec. (a
processor-dependent value) spread over a 3 frame window.
At any Tq period, the maximum bit rate supported by a TPE
is calculated as follows:

MaxTpeBitrate = (Nxq * VleThroughput) - (Sum of
transcoding bit rate values of every video channel on
the TPE over the past Nxq - 1 Tq interrupts),

where Nxq = number of Tq time slots in the
averaging window, e.g., 3 NTSC frame times (100 ms);
VleThroughput is the throughput of the VLE in terms of
average bit rate, e.g., 12 Mbits/sec.
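A minimal sketch of the MaxTpeBitrate calculation follows. The names are hypothetical: n_tq stands for Nxq, and past_slot_totals is an assumed history in which each element is the sum of the transcoding bit rates of all video channels on the TPE for one past Tq interrupt, most recent last.

```python
def max_tpe_bitrate(n_tq, vle_throughput, past_slot_totals):
    """Largest aggregate bit rate the TPE can take in the current Tq slot
    while keeping the windowed average within the VLE throughput."""
    # Only the most recent n_tq - 1 slots count toward the window.
    recent = past_slot_totals[-(n_tq - 1):] if n_tq > 1 else []
    return n_tq * vle_throughput - sum(recent)
```

If the TPE has run below its throughput in recent slots, the bound for the current slot rises above VleThroughput; if it has run at exactly VleThroughput, the bound equals VleThroughput.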

6.2.6. Distribute the available bit rate among the 
video channels 

The following procedure applies to each stat remux
group.

1. The QLP determines the ideal bandwidth allocation in
the absence of minimum and maximum bit rate constraints.

    NominalBitrate = AvailableVideoBitrate / Number
        of video channels in the statmux group
    TotalNeed = Sum of NeedParameter over every
        video channel
    if (TotalNeed > 0)
    {
        for (every video channel)
        {
            NeedBitrate[channel] =
                AvailableVideoBitrate * NeedParameter[channel] /
                TotalNeed
        }
    } else {
        for (every video channel)
            NeedBitrate[channel] = NominalBitrate
    }

2. Each video channel is assigned the MinTcodeBitrate of
the channel. If the sum of MinTcodeBitrate exceeds
the AvailableVideoBitrate, the bandwidth is
distributed in proportion to the MinTcodeBitrate.

    TotalMinBitrate = sum of MinTcodeBitrate over
        the statmux group
    if (TotalMinBitrate > AvailableVideoBitrate)
    {
        for (every video channel)
        {
            TcodeBitrate[channel] =
                MinTcodeBitrate[channel] *
                AvailableVideoBitrate / TotalMinBitrate
        }
        AvailableVideoBitrate = 0
        Done with transcoding bit rate allocation.
    } else {
        for (every video channel) {
            TcodeBitrate[channel] = MinTcodeBitrate[channel]
            NeedBitrate[channel] = Max(0,
                NeedBitrate[channel] - MinTcodeBitrate[channel])
        }
        AvailableVideoBitrate = AvailableVideoBitrate
            - TotalMinBitrate
    }

3. The QLP then tries to satisfy the user minimum bit
rate requirement. The QLP bounds the user minimum
bit rate by the MaxTcodeBitrate before applying the
user minimum bit rate.

    for (every video channel)
    {
        if (UserMinBitrate[channel] >
            MaxTcodeBitrate[channel])
            minBitrate[channel] =
                MaxTcodeBitrate[channel]
        else
            minBitrate[channel] =
                UserMinBitrate[channel]
    }
    ExtraMinBitrate = Sum of (minBitrate -
        MinTcodeBitrate) over every channel whose
        UserMinBitrate is higher than MinTcodeBitrate.
    if (ExtraMinBitrate > AvailableVideoBitrate)
    {
        for (every video channel)
        {
            if (minBitrate[channel] >
                MinTcodeBitrate[channel])
            {
                extraBitrate = (
                    minBitrate[channel] - MinTcodeBitrate[channel]) *
                    AvailableVideoBitrate / ExtraMinBitrate
                TcodeBitrate[channel] =
                    TcodeBitrate[channel] + extraBitrate
                NeedBitrate[channel] = Max(0,
                    NeedBitrate[channel] - extraBitrate)
            }
        }
        AvailableVideoBitrate = 0
    } else {
        for (every video channel)
        {
            if (minBitrate[channel] >
                MinTcodeBitrate[channel])
            {
                extraBitrate = minBitrate[channel]
                    - MinTcodeBitrate[channel]
                TcodeBitrate[channel] =
                    TcodeBitrate[channel] + extraBitrate
                NeedBitrate[channel] = Max(0,
                    NeedBitrate[channel] - extraBitrate)
                AvailableVideoBitrate =
                    AvailableVideoBitrate - extraBitrate
            }
        }
    }

4. The QLP calculates a maximum bit rate value for
each channel based on the user maximum bit rate, the
maximum and minimum transcoding bit rates to protect
the decoder buffer, and the maximum processing bit
rate that can be supported by each TPE.

    for (every TPE)
    {
        tpeAvailableBitrate = MaxTpeBitrate[tpeIndex]
            - sum of TcodeBitrate over the TPE
        tpeNeedBitrate = sum of NeedBitrate over the
            TPE
        for (every channel processed by the TPE)
            MaxBitrate[channel] = Min(
                MaxTcodeBitrate[channel],
                (tpeAvailableBitrate * NeedBitrate[channel] /
                    tpeNeedBitrate) + TcodeBitrate[channel],
                Max(UserMaxBitrate[channel],
                    MinTcodeBitrate[channel]))
    }

5. The QLP assigns the remaining bandwidth in proportion
to the remaining NeedBitrate values.

    TotalNeedBitrate = sum of NeedBitrate over
        all video channels
    for (every video channel) {
        TcodeBitrate[channel] = TcodeBitrate[channel]
            + (AvailableVideoBitrate * NeedBitrate[channel] /
            TotalNeedBitrate)
    }
    AvailableVideoBitrate = 0

6. The QLP applies the maximum bit rate constraint on
the bit rate allocation.

    for (every video channel)
    {
        if (TcodeBitrate[channel] >
            MaxBitrate[channel])
        {
            AvailableVideoBitrate =
                AvailableVideoBitrate + TcodeBitrate[channel] -
                MaxBitrate[channel]
            TcodeBitrate[channel] =
                MaxBitrate[channel]
            NeedBitrate[channel] = 0
        }
    }

7. The QLP allocates the extra bandwidth collected from
the channels that exceed the maximum bit rate.

    TotalNeedBitrate = Sum of NeedBitrate over every
        channel
    if (AvailableVideoBitrate > 0)
    {
        for (every video channel)
        {
            extraBitrate = AvailableVideoBitrate *
                NeedBitrate[channel] / TotalNeedBitrate
            if (extraBitrate + TcodeBitrate[channel] >
                MaxBitrate[channel])
                extraBitrate = MaxBitrate[channel] -
                    TcodeBitrate[channel]
            TcodeBitrate[channel] = TcodeBitrate[channel]
                + extraBitrate
            AvailableVideoBitrate =
                AvailableVideoBitrate - extraBitrate
        }
    }

8. The QLP allocates the remaining bandwidth in
proportion to the difference between the current
allocated bit rate and the maximum bit rate.

    if (AvailableVideoBitrate > 0)
    {
        TotalHeadroom = sum of
            (MaxBitrate[channel] - TcodeBitrate[channel]) over
            every channel
        for (every channel)
        {
            TcodeBitrate[channel] =
                TcodeBitrate[channel] + AvailableVideoBitrate * (
                MaxBitrate[channel] - TcodeBitrate[channel]) /
                TotalHeadroom
        }
    }

The QLP maintains a queue of the transcoding bit
rate for each video channel. In each Tq interrupt, the
calculated transcoding bit rate values are stored in
the queues, and retrieved 0.5 seconds later to use as
transmission bit rate values.
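The overall shape of the distribution procedure can be illustrated with a simplified sketch. It is not the full procedure: it covers only steps 1, 2, 5, 6 and 8, omits the user minimum (step 3), the per-TPE bound (step 4) and the need-proportional redistribution (step 7), and takes the per-channel maximum as a given input. It also adds guards against division by zero and caps the final redistribution at the total headroom. All names are hypothetical.

```python
def allocate(available, need, min_rate, max_rate):
    """Return per-channel transcoding bit rates for one statmux group.
    Assumes min_rate[c] <= max_rate[c] for every channel."""
    n = len(need)
    # Step 1: ideal share in proportion to need (equal split if no need).
    total_need = sum(need)
    if total_need > 0:
        need_rate = [available * p / total_need for p in need]
    else:
        need_rate = [available / n] * n
    # Step 2: grant the minimums, scaled down if they do not fit.
    total_min = sum(min_rate)
    if total_min > available:
        return [m * available / total_min for m in min_rate]
    rate = list(min_rate)
    need_rate = [max(0.0, r - m) for r, m in zip(need_rate, min_rate)]
    available -= total_min
    # Step 5: share what is left in proportion to the remaining need.
    total = sum(need_rate)
    if total > 0:
        rate = [r + available * q / total for r, q in zip(rate, need_rate)]
        available = 0.0
    # Step 6: clip to the maximum and collect the excess.
    for c in range(n):
        if rate[c] > max_rate[c]:
            available += rate[c] - max_rate[c]
            rate[c] = max_rate[c]
    # Step 8: hand the excess out in proportion to the headroom.
    headroom = [mx - r for mx, r in zip(max_rate, rate)]
    total_headroom = sum(headroom)
    if available > 0 and total_headroom > 0:
        give = min(available, total_headroom)
        rate = [r + give * h / total_headroom
                for r, h in zip(rate, headroom)]
    return rate
```

For example, with 10 units available, needs of 1 and 3, minimums of 1 each and maximums of 8 and 6, the second channel is clipped at 6 and the first receives the excess, ending at 4.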

6.2.7. Initial Target Frame Size Calculation 

At the last Tq slot of a frame, the QLP calculates
an initial value for the target frame size as follows.

InitialTargetFrameSize = OrigFrameSize *
AvgTcodeBitrate / AvgInBitrate,

where AvgTcodeBitrate is the average transcoding
bit rate for the frame, defined as the sum of
TcodeBitrate over all Tq slots occupied by the frame.

The TPEs may not be ready to transcode a new frame
at this time; therefore, the QLP maintains a target
frame size queue for each video channel. The
InitialTargetFrameSize value is stored in the queue for
the corresponding channel, and is retrieved later when
the TPE is ready to transcode the frame.

6.3. Passthrough Decision 

At the first Tq interrupt of a frame, the QLP
decides whether to pass through a frame or not. The
passthrough decision is made for each channel, at the
beginning of a new frame, based on the transcoding
bit rate calculated at the first Tq slot of the frame,
as follows.

PassThroughBitrate = PassThroughMargin * 
OrigFrameSize / FrameTqCount ; 

where PassThroughMargin is a parameter less than 
but close to 1.0, e.g. 0.95; OrigFrameSize is the 
number of bits in the input frame; and FrameTqCount is 
the number of Tq slots in the frame. The use of 
PassThroughMargin allows the input frames whose size is 
slightly higher than the target frame size to be passed 
through, thereby preserving the quality of the frame, 
and also saving transcoder processing cycles. 

if ( TcodeBitrate > PassThroughBitrate ) 

{ 

Pass through the entire frame. 
} else { 

Transcode the frame. 

} 
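The passthrough test can be written as a small predicate. In this sketch the threshold follows the formula above literally, so both quantities are expressed in bits per Tq slot to keep the units consistent; that unit convention, like the function name, is an assumption of the example.

```python
def should_pass_through(tcode_bits_per_tq, orig_frame_size, frame_tq_count,
                        pass_through_margin=0.95):
    """True if the frame should be passed through untouched."""
    pass_through_rate = (pass_through_margin * orig_frame_size
                         / frame_tq_count)
    return tcode_bits_per_tq > pass_through_rate
```

With a 100,000 bit frame spanning 4 Tq slots and a margin of 0.95, the threshold is 23,750 bits per slot, so an allocation of 24,000 bits per slot passes the frame through while 23,000 does not.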

6.4. Target frame size calculation 

The QLP calculates the maximum frame size and the
minimum frame size values based on the latest buffer
level information as soon as it receives a message from
the TPE signaling that the TPE is ready to transcode a
new frame. The QLP then pulls the target frame size
out from the target frame size queue 132, and computes
the final value of the target frame size using maximum
and minimum frame size constraints.

6.4.1. Compute the maximum frame size

The QLP calculates the maximum frame size to
protect the decoder buffer from underflow. The
calculation is similar to that of the approximate
maximum frame size calculation during the Tq interrupts
(6.2.1):

MaxFrameSize = DelayBits - FifoLevel -
LastOutputFrameSize.

The value of DelayBits is the number of bits
transmitted to the decoder from the time the FIFO level
was read to the decode time of the frame, and can be
calculated by summing the corresponding transmission
bit rate values currently in the transmission bit rate
queue.

The value of FifoLevel is the transcoder FIFO
level latched by the transcoder. That is, the
FifoLevel is read by the transcoder and passed to the
QLP.

6.4.2. Compute the minimum frame size 

The QLP calculates the minimum frame size to
protect the decoder buffer from overflow. The
calculation is similar to that of the approximate
minimum frame size calculation during the Tq interrupts
(6.2.2). The minimum frame size is related to the
maximum frame size by:

MinFrameSize = MaxFrameSize + (Number of bits
transmitted to the decoder from the decode time of the
current frame to the decode time of the next frame) -
(Size of decoder's buffer).

As mentioned before, for MPEG2 Main Profile Main
Level, the decoder's buffer size is 1.835 Mbits.

6.4.3. Compute the carryover from the previous frame

The transcoders may not be able to generate
exactly the number of bits equal to the target frame
size. The surplus or deficit of bits from transcoding
the previous frame is lumped in with the target frame
size of the current frame. This deviation (surplus or
deficit) is calculated as:

FrameCarryOver = LastOutputFrameSize -
(TargetFrameSize of previous frame).

6.4.4. Compute the Target Frame size 

The QLP pulls the InitialTargetFrameSize value out
from the target frame size queue of the corresponding
video channel, and bounds the target frame size by the
maximum and minimum values:

TargetFrameSize = Min ( MaxFrameSize, Max (
MinFrameSize, InitialTargetFrameSize + FrameCarryOver )
).
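The carryover and the bounding step can be combined in one short function. This is a literal transcription of the two formulas above; the function and argument names are hypothetical.

```python
def target_frame_size(initial_target, last_output_size, previous_target,
                      min_frame_size, max_frame_size):
    """Final target frame size (bits) after carryover and bounding."""
    # Deviation of the previous frame from its own target.
    frame_carry_over = last_output_size - previous_target
    return min(max_frame_size,
               max(min_frame_size, initial_target + frame_carry_over))
```

The inner Max applies the lower bound first, and the outer Min then guarantees that the result never exceeds MaxFrameSize, even if the two bounds were to cross.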

The QLP then sends the values of MinFrameSize,
MaxFrameSize, and TargetFrameSize to the TPEs. These
values are used to guide the rate control of the
transcoding process.

6.5. Quantization control

Within a frame, the following algorithm is used
for calculating the quantization scale value for every
macroblock to be transcoded. A new quantization scale
QNew is calculated by scaling the quantization scale of
the input macroblock QOld by a targeted bit reduction
ratio, RNew/ROld. Typically, each macroblock has a
quantizer scale. However, a group of macroblocks, such
as in a slice or other grouping, may be associated with
a common quantizer scale. In this case, a new
quantization scale is determined for the group.

Initialization:

    ROld = Original number of bits in the frame.
    RNew = TargetFrameSize = Target number of bits
        to be generated by transcoding/requantizing the frame.

For every slice, do:
{
    QNew = QOld * ROld / RNew

    /* Update ROld and RNew after requantizing a slice: */
    ROld = ROld - original number of bits in the slice.
    RNew = RNew - new number of bits generated by
        transcoding (e.g., including requantizing) the slice.
}

For QNew, rounding to the next higher integer, or
to the closest integer, may be used. The above formula
should result in the frame being transcoded to the
target frame size. The transcoded frame size may go
over the target, but it should not exceed the maximum
frame size.
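The per-slice update can be sketched as a loop. Several parts are assumptions for illustration: each slice is modeled as a (QOld, original bits) pair, transcode_slice is a hypothetical caller-supplied function returning the number of bits actually produced, rounding up is chosen among the permitted rounding options, and the result is capped at the MAX_QL value of 112 given as an example in this section.

```python
import math

MAX_QL = 112  # example maximum quantizer level from this section


def requantize_slices(slices, target_frame_size, transcode_slice):
    """slices: list of (q_old, original_bits) per slice.
    transcode_slice(index, q_new) -> bits produced for that slice.
    Returns the new quantizer scale used for each slice."""
    r_old = sum(bits for _q, bits in slices)  # bits left in the input
    r_new = target_frame_size                 # bits left in the budget
    q_used = []
    for i, (q_old, orig_bits) in enumerate(slices):
        if r_new > 0:
            q_new = min(MAX_QL, math.ceil(q_old * r_old / r_new))
        else:
            q_new = MAX_QL  # budget exhausted: coarsest quantizer
        q_used.append(q_new)
        # Update both running totals after the slice is requantized.
        r_old -= orig_bits
        r_new -= transcode_slice(i, q_new)
    return q_used
```

Because ROld and RNew are updated after every slice, a slice that overshoots its share makes the quantizer coarser for the slices that follow.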

However, if the maximum frame size is reached very
early in the frame, which can happen, e.g., if the
quantization scale at the beginning of the frame is
low, thereby generating a lot of bits, the quantizer
scale is set to the maximum level (coarsest quantizing)
and the rest of the frame will consequently have very
poor quality. To avoid this, a minimum number of bits
per macroblock, mb_budget, is allocated. As the frame
is transcoded, if a running count of the number of bits
used grows too large, i.e., the number of bits used
reaches a certain level, which is adjusted as
requantization of the frame progresses, a panic
quantizer is set for a short time until there are
enough bits left for the remaining macroblocks to have
mb_budget number of bits. That is, the panic_level is
a quantizer level to try to force the MBs to have
mb_budget or smaller number of bits. This spreads the
panic quantizer over the frame, such that only a
portion of the frame may go into the panic mode. To
achieve this, at the beginning of each frame,
initialize the following variables:

mb_budget = α * TargetFrameSize / number_mbs

panic_level = MaxFrameSize - mb_budget * number_mbs

α determines the minimum number of bits allocated
to each macroblock as a fraction of the average number
of bits per macroblock using the frame target size,
TargetFrameSize. The range is 0 < α < 1. An α of 1/4 to
1/2 may be suitable for most cases. If α is too big, the
panic condition may be triggered too early; if α is zero,
then the panic condition may trigger too late, whereby
the rest of the frame is stuck in panic mode.

After each macroblock is coded,

    panic_level = panic_level - bits_used_mb + mb_budget

    if (panic_level < 0)
        QNew = MAX_QL;

where MAX_QL = 112 (e.g., the applicable maximum
QL for the system).

If the frame size is less than the minimum frame 
size, zeros are appended to the end of the bitstream, 
such that the frame size is equal to, or greater than, 
the minimum frame size. 
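The panic bookkeeping can be traced with a small sketch. It only reports, for a given sequence of per-macroblock bit counts, where the panic condition would be raised; it does not model the effect of the panic quantizer on later macroblocks. The parameter name fraction stands for the budget fraction described above and, like the function name, is hypothetical.

```python
def panic_flags(mb_bits, target_frame_size, max_frame_size, fraction=0.5):
    """Return one flag per macroblock: True where panic_level has dropped
    below zero after that macroblock is coded."""
    number_mbs = len(mb_bits)
    mb_budget = fraction * target_frame_size / number_mbs
    panic_level = max_frame_size - mb_budget * number_mbs
    flags = []
    for bits_used_mb in mb_bits:
        # Spend the bits used, and release this macroblock's reserved budget.
        panic_level = panic_level - bits_used_mb + mb_budget
        flags.append(panic_level < 0)
    return flags
```

In the example used for checking this sketch, two large macroblocks at the start of a four-macroblock frame exhaust the budget after the second one, and the flag remains set for the rest of the frame.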
6.6. PCR slot

The MPEG standard requires the PCR (Program Clock
Reference) to be sent at a maximum interval of 100 ms.
The actual PCR value is not known until the
transmission time, so the transcoder creates a
placeholder slot for the PCR.

From the target frame size, the QLP estimates the
time used for transmitting the frame, and hence the
minimum number of PCRs required to be inserted in the
frame to satisfy the maximum PCR interval requirement. In
satisfying this requirement, note that while uncoded
pictures have constant duration (1/30 sec), coded
bitstreams may have a variable duration for each frame.
For example, if a frame has 100,000 bits and is
transmitted at 1 Mbps, the duration is 0.1 sec. If the
frame is transmitted at 2 Mbps, then the duration is
0.05 sec. The amount of time required to transmit the
frame (or, more precisely, the time lapse from the time
the first bit of the frame leaves the transcoder's
output buffer (FIFO) 445 to the time the last bit of
the frame leaves the FIFO) is estimated as:

TxFrameDuration = TargetFrameSize / (minimum value
in the transmission bit rate queue).

The minimum number of PCRs to insert during the 
frame is: 

MinPcrCount = TxFrameDuration / MaxPcrSeparation,
rounded up to the nearest integer,

where MaxPcrSeparation is the maximum separation 
between PCRs as required by MPEG (100 ms) . The value 
of MaxPcrSeparation=80 ms is used to provide a 20 ms 
margin. 
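The PCR count follows directly from the two formulas above, as in this minimal sketch. The function name is hypothetical; the default separation is the 80 ms value given in the text.

```python
import math


def min_pcr_count(target_frame_size, tx_bitrate_queue,
                  max_pcr_separation_s=0.080):
    """Minimum number of PCR slots to reserve in a frame."""
    # Worst case: the whole frame is sent at the lowest queued rate.
    tx_frame_duration = target_frame_size / min(tx_bitrate_queue)
    return math.ceil(tx_frame_duration / max_pcr_separation_s)
```

Using the lowest rate in the queue gives the longest possible transmission time, so the count errs on the side of inserting more PCRs rather than fewer.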

Accordingly, it can be seen that the present 
invention provides an efficient statistical 
remultiplexer for processing data in a number of 
channels that include video data. In one aspect of the 
invention, transcoding of the video data is delayed 
while statistical information is obtained from the 
data. Bit rate need parameters for the data are 
determined based on the statistical information, and
the video data is transcoded based on the respective
bit rate need parameters following the delaying.

In another aspect of the invention, a transcoding
bit rate for video frames at the stat remux is updated
a plurality of times at successive intervals to allow a
closer monitoring of the bit rate. Moreover, minimum
and maximum bounds for the transcoding bit rate are
updated in each interval. Thus, a portion of a frame
is transcoded in a first interval, then the transcoding
bit rate is updated, then a second portion of the frame
is transcoded in a second interval, and so forth.

In yet another aspect of the invention, the
pre-transcoding quantization scales of the macroblocks in a
frame are scaled to provide corresponding new
quantization scales for transcoding based on a ratio of
a pre-transcoding amount of data in the frame and a
target, post-transcoding amount of data for the frame.
Moreover, the quantization scales are adjusted for
different portions of the frame as the portions are
transcoded to ensure that a minimum amount of
transcoding bandwidth is allocated to each macroblock.

Although the invention has been described in
connection with various preferred embodiments, it
should be appreciated that various modifications and
adaptations may be made thereto without departing from
the scope of the invention as set forth in the claims.



